10 research outputs found
Non-hierarchical Structures: How to Model and Index Overlaps?
Overlap is a common phenomenon seen when structural components of a digital
object are neither disjoint nor nested inside each other. Overlapping
components resist reduction to a structural hierarchy, and tree-based indexing
and query processing techniques cannot be used for them. Our solution to this
data modeling problem is TGSA (Tree-like Graph for Structural Annotations), a
novel extension of the XML data model for non-hierarchical structures. We
introduce an algorithm for constructing TGSA from annotated documents; the
algorithm can efficiently process non-hierarchical structures and is associated
with formal proofs, ensuring that transformation of the document to the data
model is valid. To enable high performance query analysis in large data
repositories, we further introduce an extension of XML pre-post indexing for
non-hierarchical structures, which can process both reachability and
overlapping relationships.
Comment: The paper has been accepted at the Balisage 2014 conference.
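For readers unfamiliar with the pre-post indexing that the abstract extends, the following is a minimal sketch of the standard scheme on an ordinary tree. It is not the TGSA extension itself, which the abstract does not spell out. The `Node` class, the `label` and `is_ancestor` helpers, and the sample document are all illustrative assumptions. Each node receives a pre-order and a post-order rank, and a reachability (ancestor) test reduces to two integer comparisons.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node; pre/post are filled in by label()."""
    name: str
    children: List["Node"] = field(default_factory=list)
    pre: int = -1
    post: int = -1


def label(root: Node) -> None:
    """Assign pre-order and post-order ranks in one depth-first traversal."""
    pre_counter = 0
    post_counter = 0

    def visit(node: Node) -> None:
        nonlocal pre_counter, post_counter
        node.pre = pre_counter          # rank when the node is first entered
        pre_counter += 1
        for child in node.children:
            visit(child)
        node.post = post_counter        # rank when the node is left
        post_counter += 1

    visit(root)


def is_ancestor(a: Node, d: Node) -> bool:
    """a is a proper ancestor of d iff a starts before d and ends after d."""
    return a.pre < d.pre and a.post > d.post


# Illustrative document: doc -> (chapter -> (para1, para2), appendix)
para1, para2 = Node("para1"), Node("para2")
chapter = Node("chapter", [para1, para2])
appendix = Node("appendix")
doc = Node("doc", [chapter, appendix])
label(doc)

print(is_ancestor(doc, para2))        # True
print(is_ancestor(chapter, appendix)) # False
```

In a strict tree, two nodes are either nested or disjoint, which is why this pair of ranks suffices. Overlapping components break that assumption, which is the gap the indexing extension described above is meant to close.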
Unified Class Evolution by Object-Oriented Views
Object-oriented databases are said to support evolution and incremental development. At the schema level, however, this evolution is firmly restricted: class hierarchies can only evolve downwards, by subclassing. We show a unified approach to class evolution in object-oriented databases in which class hierarchies are allowed to grow in all directions, covering evolution situations such as generalisation, specialisation, and class versioning. We show how to make the evolution transparent, allowing existing and new clients to coexist and be clients of the same (existing and new) objects. A design of this approach based on object-oriented database views is shown.
1 Introduction
In most database applications there is a need for letting the schema evolve. There are several reasons for this. There may be design flaws that are not discovered before (some of the) applications are implemented and the database is populated. The domain being modelled is evolving, new applications ha..
Transparent Evolution and Integration of Classes in Object-Oriented Databases
Object-oriented databases have limited support for evolution at the schema level. This paper presents a framework for transparent class evolution and integration, which unifies existing evolution techniques such as subclassing, generalisation, class versioning, and integration by views. The transparency offered allows existing and new clients of classes to remain unchanged upon evolution and integration of populated classes. This is done by separating the extensional and intensional dimensions of classes into two different hierarchies, and by having external constructs for describing interdependencies between the properties of the classes.
1 Introduction
Databases are often exposed to evolution at the schema level. This may be the result of bad schema design, of an evolving domain being modelled, or of independently developed databases having to be integrated. Classes in object-oriented databases already provide for some evolution by the use of subclassing, which lets a class extend another ..
Improving the Performance of Pipelined Query Processing with Skipping
Web search engines need to provide high throughput and
short query latency. Recent results show that pipelined query processing
over a term-wise partitioned inverted index may have superior throughput.
However, the query processing latency and scalability with respect to
the collection size are the main challenges associated with this method.
In this paper, we evaluate the effect of inverted index skipping on the
performance of pipelined query processing. Further, we introduce a novel
idea of using Max-Score pruning within pipelined query processing and
a new term assignment heuristic, partitioning by Max-Score. Our current
results indicate a significant improvement over the state-of-the-art
approach and lead to several further optimizations, which include dynamic
load balancing, intra-query concurrent processing and a hybrid
combination between pipelined and non-pipelined execution.
This is the authors' accepted and refereed manuscript to the article
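To illustrate the general idea behind inverted index skipping mentioned in the abstract above, here is a minimal sketch of intersecting two sorted posting lists with fixed-distance skips. This is a textbook-style simplification, not the pipelined implementation evaluated in the paper. The function name, the square-root skip distance, and the sample doc-id lists are assumptions made for illustration.

```python
import math
from typing import List


def intersect_with_skips(short: List[int], long: List[int]) -> List[int]:
    """Intersect two sorted lists of unique doc ids, skipping over `long`.

    Assumption: a skip pointer is available every `skip` entries of `long`;
    here it is simulated by direct indexing into the list.
    """
    skip = max(1, int(math.sqrt(len(long))))
    result = []
    j = 0  # current position in `long`
    for doc in short:
        # Follow skip pointers while the landing entry does not pass `doc`.
        while j + skip < len(long) and long[j + skip] <= doc:
            j += skip
        # Finish with a short linear scan inside the current block.
        while j < len(long) and long[j] < doc:
            j += 1
        if j < len(long) and long[j] == doc:
            result.append(doc)
    return result


postings_rare = [3, 8, 21, 40]
postings_common = [1, 2, 3, 5, 8, 13, 21, 34, 55]
print(intersect_with_skips(postings_rare, postings_common))  # [3, 8, 21]
```

The benefit grows with the length difference between the lists: large runs of the longer list are jumped over without being decoded or compared entry by entry.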